Abstract

Let $P$ and $Q$ be two probability distributions which differ only for
values with probability at least $p > 0$. We show that the variational
distance $\delta(P^n, Q^n)$ between the $n$-fold product distributions $P^n$
and $Q^n$ is upper bounded by $\sqrt{n/(2p)}\,\delta(P, Q)$, i.e., it cannot
grow faster than the square root of $n$.

1 Preliminaries 

Let $P$ be a probability distribution with range $\mathcal{Z}$ and let
$n \in \mathbb{N}$. We denote by $P^n$ the $n$-fold product distribution,
that is,

$$P^n(z_1, \ldots, z_n) = \prod_{i=1}^{n} P(z_i)$$

for any $z_1, \ldots, z_n \in \mathcal{Z}$. Note that $P^n$ describes $n$
independently repeated random experiments with distribution $P$.
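As a minimal sketch of this definition, the $n$-fold product of a small distribution can be tabulated explicitly. The helper name `product_distribution` and the representation of $P$ as a Python dict are assumptions made for illustration only.

```python
from itertools import product
from math import prod

def product_distribution(P, n):
    """Return P^n as a dict mapping n-tuples (z_1, ..., z_n) to prod_i P(z_i)."""
    return {zs: prod(P[z] for z in zs) for zs in product(P, repeat=n)}

# Illustrative distribution on a two-element range.
P = {"a": 0.25, "b": 0.75}
P3 = product_distribution(P, 3)
```

The table has $|\mathcal{Z}|^n$ entries, so this direct enumeration is only practical for small $n$.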

The variational distance between two probability distributions $P$ and $Q$
with range $\mathcal{Z}$ is defined as

$$\delta(P, Q) := \frac{1}{2} \sum_{z \in \mathcal{Z}} |P(z) - Q(z)| .$$

(See, e.g., [1]. $\delta(\cdot,\cdot)$ is also called statistical difference [4],
Kolmogorov distance, or trace distance [3].)

Note that $\delta$ is a distance measure on the set of probability
distributions with range $\mathcal{Z}$. In particular, $\delta$ is symmetric,
$\delta(P, Q) = 0$ if and only if $P = Q$, and the triangle inequality

$$\delta(P, Q) \le \delta(P, P') + \delta(P', Q) \qquad (1)$$

holds.
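The definition and the triangle inequality (1) can be illustrated with a short sketch. The function name `variational_distance` and the three example distributions are hypothetical and chosen only for this illustration.

```python
def variational_distance(P, Q):
    """delta(P, Q) = 1/2 * sum_z |P(z) - Q(z)| for dicts mapping values to probabilities."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(z, 0.0) - Q.get(z, 0.0)) for z in support)

P = {0: 0.5, 1: 0.5}
R = {0: 0.4, 1: 0.6}   # plays the role of P' in inequality (1)
Q = {0: 0.1, 1: 0.9}

d_PQ = variational_distance(P, Q)
d_PR = variational_distance(P, R)
d_RQ = variational_distance(R, Q)
```

In this example the three distributions lie on a line, so the triangle inequality holds with equality up to rounding.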



2 Main Result and Proof 

2.1 Upper Bounds for the Variational Distance 

Let $P$ and $Q$ be two probability distributions with range $\mathcal{Z}$.
It is a direct consequence of the triangle inequality (1) that the
variational distance $\delta(P^n, Q^n)$ between the $n$-fold product
distributions $P^n$ and $Q^n$ cannot grow faster than linearly in $n$, i.e.,

$$\delta(P^n, Q^n) \le n\,\delta(P, Q) . \qquad (2)$$

Moreover, it is easy to find examples where this inequality is almost tight.
Let, e.g., $P$ and $Q$ be two binary distributions with range
$\mathcal{Z} = \{0, 1\}$ such that $P(1) = 0$ and $Q(1) = \varepsilon$ for
some $\varepsilon > 0$. If $n\varepsilon \ll 1$ then the variational
distance $\delta(P^n, Q^n)$ is roughly equal to
$n\,\delta(P, Q) = n\varepsilon$.
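The near-tight example can be worked out in closed form. The sketch below assumes the reading $P(1) = 0$ and $Q(1) = \varepsilon$, under which $P^n$ is concentrated on the all-zero string and $\delta(P^n, Q^n) = 1 - (1-\varepsilon)^n$. The function name and the numerical values are illustrative.

```python
def distance_point_mass_example(eps, n):
    """delta(P^n, Q^n) for binary P, Q with P(1) = 0 and Q(1) = eps.

    P^n puts all its mass on the all-zero string, to which Q^n assigns
    probability (1 - eps)^n, so the distance equals 1 - (1 - eps)^n.
    """
    return 1.0 - (1.0 - eps) ** n

eps, n = 1e-4, 100            # n * eps = 0.01, i.e. well below 1
exact = distance_point_mass_example(eps, n)
linear_bound = n * eps        # right hand side of the linear bound
```

For these values the exact distance is within one percent of the linear bound, which is what "almost tight" means in this regime.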

However, the upper bound (2) can only be close to optimal if, for some
element $z \in \mathcal{Z}$, the relative difference
$|P(z) - Q(z)| / (P(z) + Q(z))$ between the probabilities is large. (Note
that, in the above example, this relative difference for $z = 1$ equals
one.) Indeed, the following result states that, in all other cases,
$\delta(P^n, Q^n)$ cannot grow faster than the square root of $n$.

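Lemma 2.1. Let $P$ and $Q$ be two probability distributions with range
$\mathcal{Z}$, let $\mathcal{D} := \{z \in \mathcal{Z} : P(z) \neq Q(z)\}$ be
the subset of $\mathcal{Z}$ where $P$ and $Q$ differ, and let
$p := \inf_{z \in \mathcal{D}} \min(P(z), Q(z))$. If $p > 0$ then, for any
$n \in \mathbb{N}$,

$$\delta(P^n, Q^n) \le \sqrt{\frac{1}{\pi p}}\,\sqrt{n + \frac{1}{p}}\;\delta(P, Q)$$

and

$$\delta(P^n, Q^n) \le \sqrt{\frac{n}{2p}}\;\delta(P, Q) .$$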

The first bound of Lemma 2.1 is optimal in the sense that, for any $p > 0$,
there are probability distributions $P$ and $Q$ with minimum probability $p$
such that the quotient between the left and the right hand side of the
inequality approaches one for increasing $n$ (as long as
$\delta(P^n, Q^n) \ll 1$). On the other hand, the constant $\frac{1}{2}$ in
the second bound is the smallest value of $c$ such that
$\delta(P^n, Q^n) \le \sqrt{cn/p}\;\delta(P, Q)$ always holds.
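The two bounds can be compared with the exact distance for binary distributions, where the number of ones is a sufficient statistic. This is a sketch only: it assumes the first bound reads $\sqrt{1/(\pi p)}\,\sqrt{n + 1/p}\;\delta(P,Q)$ and the second $\sqrt{n/(2p)}\;\delta(P,Q)$, and the function name and the parameters `a`, `b` are hypothetical.

```python
from math import comb, pi, sqrt

def product_distance_binary(a, b, n):
    """delta(P^n, Q^n) for binary P, Q with P(1) = a and Q(1) = b.

    All strings with the same number k of ones have the same probability,
    so the sum over {0,1}^n collapses to a sum over k = 0, ..., n.
    """
    return 0.5 * sum(
        comb(n, k) * abs(a**k * (1 - a)**(n - k) - b**k * (1 - b)**(n - k))
        for k in range(n + 1)
    )

a, b = 0.5, 0.52
p = min(a, 1 - a, b, 1 - b)   # minimum probability on the set where P and Q differ
delta = abs(a - b)            # delta(P, Q) for two binary distributions

rows = []
for n in (1, 10, 100, 400):
    exact = product_distance_binary(a, b, n)
    bound1 = sqrt(1 / (pi * p)) * sqrt(n + 1 / p) * delta   # assumed first bound
    bound2 = sqrt(n / (2 * p)) * delta                      # assumed second bound
    rows.append((n, exact, bound1, bound2))
```

With $p$ close to $\frac{1}{2}$, the second bound is the smaller one at $n = 1$, while the first bound is the smaller one at $n = 400$.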

The result of Lemma 2.1 is also complementary to a lower bound derived
in [4], which states that, if the maximum probabilities of $P$ and $Q$ are
small enough and if $n$ is not too large, then
$\delta(P^n, Q^n) \ge \Omega(\sqrt{n})\,\delta(P, Q)$.

2.2 Proof of Lemma 2.1

To prove Lemma 2.1 we consider a probability distribution $P_t$ parameterized
by some real value $t$ and compute a bound on the derivative, with respect
to $t$, of the variational distance $\delta(P_{t_0}^n, P_t^n)$.

Lemma 2.2. Let $\{P_t\}_{t \in \mathbb{R}}$ be a family of probability
distributions with range $\mathcal{Z}$ parameterized by $t \in \mathbb{R}$,
let $t_0 \in \mathbb{R}$, and let $z_1, z_2 \in \mathcal{Z}$ such that



and $P_t(z) = P_{t_0}(z)$ for any $z \in \mathcal{Z} \setminus \{z_1, z_2\}$.
If $p := P_{t_0}(z_1)$ and $p' := P_{t_0}(z_2)$ are nonzero then



Proof. Assume without loss of generality that $t_0 = 0$. Since $P_t(z)$ does
not depend on $t$ for $z \in \mathcal{Z} \setminus \{z_1, z_2\}$, we have
$P_t(z_1) + P_t(z_2) = p + p'$, and thus, by the definition of the
variational distance,





and 





n 



k 



where 




and 





of the function $a_{k,r}$ at $t = t_0$.
Moreover, it follows from (11) that $a'_{k,r}(0) \ge 0$ if and only if
$r \le \bar{r} := \lfloor kp/(p+p') \rfloor$. We thus conclude

$$\frac{d}{dt}\,\delta(P_t^n, P_0^n)\Big|_{t=0} = \sum_{k=0}^{n} q(k) \sum_{r=0}^{\bar{r}} \binom{k}{r}\, a'_{k,r}(0) . \qquad (3)$$



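Let $\alpha := \frac{p}{p+p'}$ and $\beta := \frac{p'}{p+p'}$. The last sum
in the above expression can then be rewritten as

$$\sum_{r=0}^{\bar{r}} \binom{k}{r}\, a'_{k,r}(0)
  = \frac{1}{p+p'} \sum_{r=0}^{\bar{r}} \binom{k}{r} \Bigl[ (k-r)\,\alpha^{r} \beta^{k-r-1} - r\,\alpha^{r-1} \beta^{k-r} \Bigr]
  = \frac{k}{p+p'} \binom{k-1}{\bar{r}}\, \alpha^{\bar{r}} \beta^{k-\bar{r}-1} . \qquad (4)$$

With the definition $\bar\alpha := \frac{\bar{r}+1}{k+1}$ and
$\bar\beta := \frac{(k+1)-(\bar{r}+1)}{k+1}$ we have

$$\begin{aligned}
\binom{k-1}{\bar{r}}\, \alpha^{\bar{r}} \beta^{k-\bar{r}-1}
 &= \binom{k+1}{\bar{r}+1}\, \frac{(\bar{r}+1)\bigl((k+1)-(\bar{r}+1)\bigr)}{k(k+1)}\, \frac{\alpha^{\bar{r}+1} \beta^{(k+1)-(\bar{r}+1)}}{\alpha\beta} \\
 &\le \binom{k+1}{\bar{r}+1}\, \frac{(\bar{r}+1)\bigl((k+1)-(\bar{r}+1)\bigr)}{k(k+1)}\, \frac{\bar\alpha^{\bar{r}+1} \bar\beta^{(k+1)-(\bar{r}+1)}}{\alpha\beta} \\
 &\le \sqrt{\frac{1}{2\pi(k+1)\bar\alpha\bar\beta}}\; \frac{(\bar{r}+1)\bigl((k+1)-(\bar{r}+1)\bigr)}{k(k+1)\,\alpha\beta} \\
 &= \frac{1}{k} \sqrt{\frac{k+1}{2\pi\alpha\beta}}\, \sqrt{\frac{\bar\alpha\bar\beta}{\alpha\beta}} ,
\end{aligned} \qquad (5)$$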



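where the first inequality follows from (12) and the second from Lemma A.1.
Using the definition of $\bar{r}$ and letting
$\gamma := k\alpha - \lfloor k\alpha \rfloor$, the expression in the second
square root of the last term can be bounded by

$$\frac{\bar\alpha\bar\beta}{\alpha\beta}
  = \frac{\lfloor k\alpha \rfloor + 1}{\alpha(k+1)} \cdot \frac{(k+1) - (\lfloor k\alpha \rfloor + 1)}{\beta(k+1)}
  = \frac{k\alpha + (1-\gamma)}{\alpha(k+1)} \cdot \frac{k\beta + \gamma}{\beta(k+1)}
  \le \frac{k + \frac{1}{\min(\alpha,\beta)}}{k+1} , \qquad (6)$$

where the last inequality follows from the fact that the two factors
$\frac{k\alpha + (1-\gamma)}{\alpha(k+1)}$ and
$\frac{k\beta + \gamma}{\beta(k+1)}$ cannot both be larger than one, since
$\beta, \gamma \in [0, 1]$. Combining this with (4) and (5) we find

$$\sum_{r=0}^{\bar{r}} \binom{k}{r}\, a'_{k,r}(0) = \frac{k}{p+p'} \binom{k-1}{\bar{r}}\, \alpha^{\bar{r}} \beta^{k-\bar{r}-1} \le s(k) , \qquad (7)$$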



where

$$s(k) := \frac{1}{p+p'} \sqrt{\frac{k + \frac{1}{\min(\alpha,\beta)}}{2\pi\alpha\beta}} .$$

Alternatively, the left hand side of (7) can be upper bounded by



To see this, assume first that $\alpha k \ge 2$ and $\beta k \ge 2$. Then
$\frac{1}{\min(\alpha,\beta)} \le \frac{k}{2}$, which implies

$$s(k) \le \frac{1}{p+p'} \sqrt{\frac{k + \frac{k}{2}}{2\pi\alpha\beta}} \le \bar{s}(k) .$$

On the other hand, if $\alpha k < 2$ or $\beta k < 2$, the bound follows from
a straightforward calculation using (12). (In this case, $\bar{r}$ or
$(k-1) - \bar{r}$ is either $0$ or $1$, i.e., the binomial in (7) equals $1$
or $k-1$.)

When (7) is combined with (3), we obtain

$$\frac{d}{dt}\,\delta(P_t^n, P_0^n)\Big|_{t=0} \le \sum_{k=0}^{n} q(k)\, s(k) . \qquad (9)$$

Note that $s(k)$ is a concave function in $k$ and that $q(k)$ are the
probabilities of a binomial distribution with mean $n(p+p')$, that is,
$\sum_{k=0}^{n} q(k) = 1$ and $\sum_{k=0}^{n} q(k)\,k = n(p+p')$. We can thus
apply Jensen's inequality to find an upper bound for the sum on the right
hand side of (9), i.e.,

$$\frac{d}{dt}\,\delta(P_t^n, P_0^n)\Big|_{t=0} \le \sum_{k=0}^{n} q(k)\, s(k) \le s\Bigl( \sum_{k=0}^{n} q(k)\, k \Bigr) = s\bigl(n(p+p')\bigr) ,$$

from which the first inequality of the lemma follows.
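The Jensen step can be checked numerically. The sketch below assumes that $s(k) = \frac{1}{p+p'}\sqrt{(k + 1/\min(\alpha,\beta))/(2\pi\alpha\beta)}$ with $\alpha = p/(p+p')$ and $\beta = p'/(p+p')$, and that $q$ is the binomial distribution with $n$ trials and success probability $p+p'$. The function name `jensen_gap` and the chosen parameters are illustrative.

```python
from math import comb, pi, sqrt

def jensen_gap(n, p, p_prime):
    """Return (sum_k q(k) s(k), s(n (p + p'))) for the concave function s."""
    w = p + p_prime                       # success probability of the binomial q
    alpha, beta = p / w, p_prime / w
    c = 1 / min(alpha, beta)

    def s(k):
        # Assumed form of s(k); concave in k because it is a square root
        # of an affine function of k.
        return sqrt((k + c) / (2 * pi * alpha * beta)) / w

    q = [comb(n, k) * w**k * (1 - w)**(n - k) for k in range(n + 1)]
    return sum(qk * s(k) for k, qk in enumerate(q)), s(n * w)

lhs, rhs = jensen_gap(50, 0.2, 0.3)
```

Under these assumptions, evaluating $s$ at the mean $n(p+p')$ simplifies to $\sqrt{\frac{1}{2\pi}\bigl(\frac{1}{p} + \frac{1}{p'}\bigr)}\,\sqrt{n + \frac{1}{\min(p,p')}}$, which no longer depends on $k$.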

Similarly, because $\bar{s}(k)$ is concave in $k$ as well, we have

$$\frac{d}{dt}\,\delta(P_t^n, P_0^n)\Big|_{t=0} \le \bar{s}\bigl(n(p+p')\bigr) ,$$

which implies the second inequality of the lemma. □

We now use the bounds provided by Lemma 2.2 to prove our main result.

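Proof of Lemma 2.1. We first prove the first inequality of the lemma for the
special case where the probabilities $P$ and $Q$ only differ for two values
$z_1$ and $z_2$. Assume (without loss of generality) that
$P(z_1) < Q(z_1)$ and let $p := P(z_1)$, $p' := P(z_2)$, $q := Q(z_1)$. For
$t \in [p, q]$, let $P_t$ be the distribution with range $\mathcal{Z}$ given
by

$$P_t := \frac{q-t}{q-p}\, P + \frac{t-p}{q-p}\, Q ,$$

i.e., $P_t(z_1) = t$, $P_t(z_2) = p + p' - t$, and $P_t(z) = P(z) = Q(z)$ for
any $z \in \mathcal{Z} \setminus \{z_1, z_2\}$. We can thus apply the first
inequality of Lemma 2.2 which gives

$$\frac{d}{ds}\,\delta(P_s^n, P_t^n)\Big|_{s=t}
  \le \sqrt{\frac{1}{2\pi} \Bigl( \frac{1}{t} + \frac{1}{p+p'-t} \Bigr)}\, \sqrt{n + \frac{1}{\min(t,\, p+p'-t)}}
  \le \sqrt{\frac{1}{\pi p}}\, \sqrt{n + \frac{1}{p}} ,$$

where the last step uses that $t = P_t(z_1)$ and $p + p' - t = P_t(z_2)$ are
both at least the minimum probability of Lemma 2.1, which is the quantity
denoted by $p$ on the right hand side.

Using Lemma A.2 we obtain

$$\delta(P^n, Q^n) = \delta(P_p^n, P_q^n) \le \int_p^q \frac{d}{ds}\,\delta(P_s^n, P_t^n)\Big|_{s=t}\, dt \le (q-p)\, \sqrt{\frac{1}{\pi p}}\, \sqrt{n + \frac{1}{p}} .$$

Since $q - p = \delta(P, Q)$, this concludes the proof of the first
inequality of the lemma for the special case where the probabilities $P$ and
$Q$ differ for at most two values in $\mathcal{Z}$.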

To prove the general case, we first observe that if the set $\mathcal{D}$ is
infinite, then $p$ equals zero and nothing has to be proven. On the other
hand, if $\mathcal{D}$ is finite, it is easy to see that there exists a
sequence $(P_i)_{i=1,\ldots,m}$ (for some $m \in \mathbb{N}$) of
distributions with range $\mathcal{Z}$ such that

• $P_1 = P$ and $P_m = Q$,

• for any $i \in \{1, \ldots, m-1\}$, the distributions $P_i$ and $P_{i+1}$
differ only for two elements in $\mathcal{D}$,

• $\min_{z \in \mathcal{D}} P_i(z) \ge p$, for all $i \in \{1, \ldots, m\}$, and

• $\sum_{i=1}^{m-1} \delta(P_i, P_{i+1}) = \delta(P_1, P_m)$.

The general assertion then follows directly from the special case proven
above and the triangle inequality (1), i.e.,

$$\delta(P^n, Q^n) \le \sum_{i=1}^{m-1} \delta(P_i^n, P_{i+1}^n)
  \le \sqrt{\frac{1}{\pi p}}\, \sqrt{n + \frac{1}{p}}\; \sum_{i=1}^{m-1} \delta(P_i, P_{i+1})
  = \sqrt{\frac{1}{\pi p}}\, \sqrt{n + \frac{1}{p}}\;\delta(P, Q) .$$

The second inequality follows by exactly the same reasoning based on
the second inequality of Lemma 2.2. □

A Appendix: Some Useful Identities 

A.1 An Upper Bound for the Binomial Coefficient

Lemma A.1. For any $n > k > 0$,

$$\binom{n}{k} \Bigl( \frac{k}{n} \Bigr)^{k} \Bigl( \frac{n-k}{n} \Bigr)^{n-k} \le \sqrt{\frac{n}{2\pi k (n-k)}} .$$

Proof. The assertion follows directly from Stirling's approximation [2],

$$\sqrt{2\pi}\, n^{n+1/2}\, e^{-n + 1/(12n+1)} < n! < \sqrt{2\pi}\, n^{n+1/2}\, e^{-n + 1/(12n)}$$

for $n > 0$. □
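The inequality of Lemma A.1 can be spot-checked numerically. The sketch below assumes the lemma reads $\binom{n}{k}(k/n)^k((n-k)/n)^{n-k} \le \sqrt{n/(2\pi k(n-k))}$. The function names and the tested range of $n$ are arbitrary choices for illustration.

```python
from math import comb, pi, sqrt

def binomial_term(n, k):
    """Left hand side: C(n, k) (k/n)^k ((n-k)/n)^(n-k)."""
    return comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)

def stirling_bound(n, k):
    """Right hand side: sqrt(n / (2 pi k (n - k)))."""
    return sqrt(n / (2 * pi * k * (n - k)))

# Collect every pair n > k > 0 in a small range that would violate the bound.
violations = [
    (n, k)
    for n in range(2, 60)
    for k in range(1, n)
    if binomial_term(n, k) > stirling_bound(n, k)
]
```

A finite scan like this is only a sanity check on the statement; the proof rests on the two-sided Stirling estimate.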

A.2 Bounding the Variational Distance by Its Path Integral

Lemma A.2. Let $a < b$, let $\{P_t\}_{t \in \mathbb{R}}$ be a family of
probability distributions parameterized by $t \in [a, b]$, and let
$f(t) := \frac{d}{ds}\,\delta(P_s, P_t)\big|_{s=t}$ be the right derivative
of the variational distance. Then

$$\delta(P_b, P_a) \le \int_a^b f(t)\, dt .$$

Proof. Since equality holds for $b = a$, it suffices to verify that

$$\frac{d}{dr}\,\delta(P_r, P_a) \le \frac{d}{dr} \int_a^r f(t)\, dt , \qquad (10)$$

for any $r \in (a, b)$. Using the triangle inequality for the variational
distance, the expression on the left hand side can be bounded by

$$\frac{d}{dr}\,\delta(P_r, P_a) \le \frac{d}{ds}\,\delta(P_s, P_r)\Big|_{s=r} = f(r) .$$

Inequality (10) then follows from the second fundamental theorem of
calculus, which concludes the proof. □



A.3 Auxiliary Identities

For any $n > 0$, $k \in [0, n]$, and $x \in [0, 1]$



> o 




(11) 



As an immediate consequence of this expression we have

$$x^{k} (1-x)^{n-k} \le \Bigl( \frac{k}{n} \Bigr)^{k} \Bigl( \frac{n-k}{n} \Bigr)^{n-k} . \qquad (12)$$